ABSTRACT

Homogeneity and isotropy of the universe at sufficiently large scales are a fundamental premise
on which modern cosmology is based. The fractal dimension of the matter distribution is a parameter
that can be used to test the hypothesis of homogeneity. In this method, galaxies are used as
tracers of the distribution of matter and samples derived from various galaxy redshift surveys 
have been used to determine the scale of homogeneity in the Universe. Ideally, for homogeneity,
the distribution should be a mono-fractal with the fractal dimension equal to the ambient
dimension. While this ideal definition is true for infinitely large point sets, it may not be
realised in practice, where we have only a finite point set. The correct benchmark for realistic
data sets is a homogeneous distribution of a finite number of points, and this should be used
in place of the mathematically defined fractal dimension for an infinite number of points (D) as
a requirement for approach towards homogeneity. We derive the expected fractal dimension
for a homogeneous distribution of a finite number of points. We show that for sufficiently 
large data sets the expected fractal dimension approaches D in absence of clustering. It is 
also important to take the weak, but non-zero amplitude of clustering at very large scales into 
account. In this paper we also compute the expected fractal dimension for a finite point set 
that is weakly clustered. Clustering introduces departures in the Fractal dimensions from D 
and in most situations the departures are small if the amplitude of clustering is small. Features 
in the two point correlation function, like those introduced by Baryon Acoustic Oscillations 
(BAO) can lead to non-trivial variations in the Fractal dimensions where the amplitude of 
clustering and deviations from D are no longer related in a monotonic manner. We show that 
in the concordance model, the fractal dimension makes a rapid transition to values close to 3 
at scales between 40 and 100 Mpc.

1 INTRODUCTION

We expect the Universe to be homogeneous and isotropic on the 
largest scales. Indeed, one of the fundamental postulates in cos- 
mology is that the Universe is spatially homogeneous and isotropic. 
It is this postulate, generally known as the cosmological principle
(CP) (Einstein 1917), that allows us to approximate the description
of space-time by a Friedmann-Lemaître-Robertson-Walker (FLRW)
metric. The standard approach to cosmology assumes that the universe
can be modelled as a perturbed FLRW universe. The large
scale structures (LSS) in the universe are believed to have been
formed due to the collapse of small inhomogeneities present in the
early Universe (Peebles 1980; Peacock 1999; Padmanabhan 2002;
Bernardeau et al. 2002). Thus it is of paramount importance to test
whether the observed distribution of galaxies approaches a homo- 
geneous distribution at large scales. 

The primary aim of galaxy surveys (Colless et al. 2001; York
et al. 2000; Shectman et al. 1996) is to determine the distribution of
matter in our Universe. Redshift surveys of galaxies have revealed
that the universe consists of a hierarchy of structures, from
groups and clusters of galaxies to superclusters and an interconnected
network of filaments spread across the observed Universe (van de
Weygaert & Schaap 2007; Colombi et al. 2000a; de Lapparent et
al. 1986; Kim et al. 2002).

Fractal dimensions can be used as an indicator to test whether 
or not the distribution of galaxies approaches homogeneity. One of 
the reasons that make the Fractal dimensions an attractive option 
is that one does not require the assumption of an average density 
(Mandelbrot 1982; Martinez & Saar 2002). Ideally one would like 
to work with volume limited samples in order to avoid corrections 
due to a varying selection function. Redshift surveys of galaxies can 
be used to construct such sub-samples from the full magnitude lim- 
ited sample but this typically leads to a sub-sample that has a much 
smaller number of galaxies as compared to the full sample. This
limitation was found to be too restrictive for the earliest surveys
and corrections for the varying selection function were attempted 
in order to determine the scale of homogeneity: for example see 
Bharadwaj, Gupta, & Seshadri (1999). With the large surveys avail- 
able today, this limitation is no longer very serious. Fractal dimen- 
sions are computed for the given sample or sub-sample and the 
scale beyond which the fractal dimension is close to the physical 
dimension of the sample is identified as the scale of homogeneity. 
We expect that at scales larger than the scale of homogeneity, any
fluctuations in density are small enough to be ignored. Thus at larger
scales, CP can be assumed to be valid and it is at these scales that 
the FLRW metric is a correct description of the Universe. 

The fractal dimension is defined in a mathematically rigorous
way only for an infinite set of points. Given that the observational
samples are finite, there is a need to understand the relation between
the fractal dimension and the physical dimension for such samples.
In this paper, we compute the expected fractal dimension for a finite
distribution of points (see, e.g., Borgani et al. 1993; Borgani
& Murante 1994). The early work on these effects has focused on
small scales where the amplitude of clustering is large. In this work,
we calculate the fractal dimension for a uniform distribution, as well
as for a weakly clustered distribution, of a finite number of points.
This is of interest at larger scales, where fractal dimensions are used
as a tool to find the scale of homogeneity.

Catalogues of different extra-galactic objects have been stud- 
ied using various statistical methods. One of the important tools 
in this direction has been the use of the two point correlation function
ξ(r) (Peebles 1980) and its Fourier transform, the power spectrum
P(k). We have precise estimates of ξ(r) (Kulkarni et al. 2007; Ross
et al. 2006; de Lapparent & Slezak 2007) and the power spectrum
P(k) (Cole et al. 2005; Percival et al. 2007b) from different galaxy
surveys. Different measurements appear to be consistent with one 
another once differences in selection function are accounted for 
(Cole et al. 2006) (but also see Sanchez & Cole (2007)). On small 
scales the two point correlation function is found to be well de- 
scribed by the form 

    \xi(r) = \left( \frac{r_0}{r} \right)^{\gamma}    (1)

where γ = 1.75 ± 0.03 and r0 = 6.1 ± 0.2 h⁻¹ Mpc for the
SDSS (Zehavi et al. 2002), and γ = 1.67 ± 0.03 and r0 =
5.05 ± 0.26 h⁻¹ Mpc for the 2dFGRS (Hawkins et al. 2003). Recent
galaxy surveys have reassured us that the power law behaviour
of ξ(r) does not extend to arbitrarily large scales. The breakdown
of this behaviour occurs at r > 16 h⁻¹ Mpc for SDSS and at
r > 20 h⁻¹ Mpc for 2dFGRS, which is consistent with the distribution
of galaxies being homogeneous at large scales. A note of
caution here is that though the ξ(r) determined from redshift surveys
is consistent with the universe being homogeneous at large
scales, in that |ξ(r)| ≪ 1 at large r, it does not actually imply that
the universe is homogeneous. This is because the two point correlation,
given by

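    \xi(r) = \langle \delta(\mathbf{x} + \mathbf{r}) \, \delta(\mathbf{x}) \rangle    (2)

where

    \delta(\mathbf{x}) = \frac{\rho(\mathbf{x}) - \bar{\rho}}{\bar{\rho}}    (3)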

presupposes that the galaxy distribution that we are analysing is homogeneous
on the large scales of our survey. This is implicit in the fact
that ρ̄, which is assumed to be the spatial average density of matter
in the universe, is computed by averaging the density within
the survey volume. Of course, it may be possible to demonstrate
that the survey is a fair sample of the universe by showing that the
values of ρ̄ derived from sub-samples of the survey are consistent
with each other, or that the value of ρ̄ computed at different scales
converges to a definite value at scales much smaller than the size of
the survey. To verify and hence validate the cosmological principle,
it is useful to consider a statistical test which does not presuppose
the premise being tested. In this paper we consider one such test,
the "multi-fractal analysis", and apply it to random as well as
clustered distributions of particles.

Fractal dimension, which is generally a fractional number, is
characterised by the scaling exponent. In most physical situations,
we need to use a set with an invariant measure characterised by a
whole spectrum of scaling exponents, instead of a single number.
Such a system is called a multi-fractal and we need to do a multi-fractal
analysis of a point set to study the system.

Various groups have used the concept of fractals to analyse 
catalogues of extra-galactic objects. See Jones et al. (2005) for an 
excellent review of quantitative measures used for describing dis- 
tributions of points. Based on the scale invariance of galaxy clus- 
tering, Pietronero (1987) suggested that the distribution of galaxies 
is a fractal to arbitrarily large scales. In a later analysis of different 
samples of galaxies Coleman & Pietronero (1992) obtained results 
consistent with this argument. On the other hand Borgani (1995) 
showed that the distribution is a fractal only on small scales and 
on large scales there is a transition to homogeneity. If the distribution
of galaxies is found to be a fractal then the average number
of galaxies in a volume of radius r centred on a galaxy should
scale as r^d, where d is the fractal dimension. Hence the number
density of neighbouring galaxies would go as ρ ∝ r^(d−D) in a D dimensional
distribution. This, when calculated for higher values of r,
will show a decrease from that of lower scales. This effect led Sylos
Labini et al. (1998) to believe that the value of the correlation length
r0 (eq. 1) increases with the increase in size of the sample. However,
this interpretation is not supported by volume limited samples
of various galaxy redshift surveys (Benoist et al. 1996; Martinez 
et al. 2001). A number of authors (Cappi et al. 1998; Hatton 1999; 
Best 2000; Amendola & Palladino 1999; Baryshev & Bukhmastova 
2004; Vasilyev et al. 2006; Sylos Labini et al. 2007) have shown the 
distribution of galaxies to be a mono-fractal up to the largest scales 
that they were able to analyse. On the other hand homogeneity has 
been seen at large scale in other analyses (Guzzo 1997; Bharadwaj, 
Gupta, & Seshadri 1999; Martinez 1999; Kurokawa, Morikawa, & 
Mouri 2001; Pan & Coles 2000; Hogg et al. 2005; Yadav et al. 
2005). The best argument in favour of large scale homogeneity 
stems from the near isotropy of radio sources or background ra- 
diation in projection on the sky (Wu, Lahav, & Rees 1999). 

The aim of this paper is to calculate the fractal dimension for a
distribution of a finite number of points which are distributed homogeneously,
as well as for those which are weakly clustered. For this
purpose we use the multi-fractal analysis to study the scaling behaviour
of uniform as well as weakly clustered distributions, in turn
finding the relationship between the fractal dimension and the two
point correlation function. We find deviations of the fractal dimension
Dq from D arising due to a finite number of points for a random
distribution with uniform density; these deviations arise due to discreteness.
In this case we can relate the deviation (of Dq from D)
to the number density of points. We further show that for a distribution
of points with weak clustering, there is an additional deviation
of Dq from D. This deviation can be related to the two point correlation,
and the intuitive relation between the amplitude of clustering
and deviation from homogeneity is given a quantitative expression.
We then apply the derived relation to cosmology and compute the
expected deviations in a model that fits most observations. 

A brief outline of the paper is as follows. In §2 we describe
the method of analysis; §3 contains results and discussion, with the
conclusions in §4.



2 FRACTAL DIMENSIONS 

Fractal dimension is the basic characterisation of any point distribution.
There are many different methods that can be used to calculate
the fractal dimension. The box counting dimension of a fractal distribution
is defined in terms of the number of non-empty boxes N(r) of radius r
required to cover the distribution. If

    N(r) \propto r^{-D_b}    (4)

we define Db to be the box counting dimension. One of the difficulties
with such an analysis is that it does not depend on the number of
particles inside the boxes and rather depends only on the number of
boxes. As such it provides limited information about the degree of
clumpiness of the distribution and is a purely geometrical measure.
To get more detailed information on clustering of the distribution 
we use the concept of correlation dimension. Instead of using the 
formal definition of correlation dimension, which demands that the 
number of points in the distribution approach infinity, we choose a 
working definition which can be applied to a distribution of a finite 
number of points. Calculation of the correlation dimension requires 
the introduction of correlation integral given by 



    C_2(r) = \frac{1}{NM} \sum_{i=1}^{M} n_i(r)    (5)

Here we assume that we have N points in the distribution and we
have M cells centred on a fraction of these points. In general the
number of points and cells are different, as one cannot use points
near the edge of the sample where a sphere of radius r is not completely
inside the sample. ni(r) denotes the number of particles
within a distance r from a particle at the point i:

    n_i(r) = \sum_{j=1}^{N} \Theta(r - |\mathbf{x}_i - \mathbf{x}_j|)    (6)

where Θ(x) is the Heaviside function.
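
As an illustration of equations (5) and (6), the following minimal Python sketch counts neighbours directly for a small point set. The uniform sample, the unit box and the function names are assumptions made for this example and are not taken from the analysis in this paper; the edge treatment simply discards centres closer than r to the box boundary, and the centre itself is counted among its own neighbours.

```python
import numpy as np

def neighbour_counts(points, centres, r):
    """n_i(r): number of points within distance r of each centre (eq. 6).

    The centre itself is included in the count (separation zero).
    """
    # Pairwise separations between every centre and every point.
    diff = centres[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist <= r).sum(axis=1)

def correlation_integral(points, r, box_size):
    """C2(r) = (1 / (N M)) * sum_i n_i(r) over the M usable centres (eq. 5)."""
    n_points = len(points)
    # Keep only centres whose sphere of radius r lies fully inside the box.
    inside = np.all((points >= r) & (points <= box_size - r), axis=1)
    centres = points[inside]
    counts = neighbour_counts(points, centres, r)
    return counts.sum() / (n_points * len(centres))

# Hypothetical test sample: 1000 uniform random points in a unit square.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(1000, 2))
c2_small = correlation_integral(pts, 0.05, 1.0)
c2_large = correlation_integral(pts, 0.10, 1.0)
print(c2_large / c2_small)
```

For a uniform two dimensional sample, doubling r should increase C2 by somewhat less than a factor of four, because the centre contributes one count at every scale.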

For the purpose of the analysis in this paper, it will be useful to 
define C2 in terms of the probability of finding particles in a sphere 
of radius r. We define the correlation integral C2 as

    C_2(r) = \frac{1}{N} \sum_{n=0}^{N} n \, P(n; r, N)    (7)

where P(n; r, N) is the normalised probability of getting n out of
N points as neighbours inside a radius r of any of the points. For
a homogeneous distribution of points, the probability for any point
to fall within the neighbourhood is proportional to the ratio of the
volume of the sphere to the total volume of the sample. In such a
case C2 reduces to the product of the volume of a sphere of radius
r and the total number of particles, divided by the total volume.
As the total number and total volume are fixed quantities, C2 for
a homogeneous distribution of points scales as r^D at sufficiently
large scales. 

The power law scaling of the correlation integral, i.e. C2(r) ∝ r^D2,
defines the correlation dimension D2 of the distribution:

    D_2(r) = \frac{\partial \log C_2(r)}{\partial \log r}    (8)

Depending on the scaling of C2, the value of correlation dimension 
D2 can vary with scale r. For the special case of a homogeneous 
distribution, we see that D2 (r) = D at sufficiently large scales and 
this matches the intuitive expectation that the correlation dimension 
of a homogeneous distribution of points should equal the mathe- 
matically defined fractal dimension for infinite number of points. 

We see that the correlation integral is defined in terms of the probability
of finding n points out of a distribution of N points within a
distance r. This makes it a measure of one of the moments of the
distribution. We need all the moments of the distribution to completely
characterise the system statistically. The multi-fractal analysis
used here does this with the generalised dimension Dq, the
Minkowski-Bouligand dimension, which is defined for an arbitrary
q and typically computed for a range of values. It differs from the
Renyi dimension only in that here the spheres
of radius r are centred on points belonging to the fractal,
whereas for the Renyi dimension the spheres need not be centred on
particles in the distribution (see the section on generalised dimensions
in Borgani (1995) for a discussion of the two types of generalised
dimensions). The definition of the generalised dimension Dq is
a generalisation of the correlation dimension D2. The correlation
integral can be generalised to define Cq(r) as

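    C_q(r) = \frac{1}{NM} \sum_{i=1}^{M} n_i^{q-1}(r) = \frac{1}{N} \sum_{n=0}^{N} n^{q-1} P(n; r, N)    (9)

which is used to define the Minkowski-Bouligand dimension

    D_q(r) = \frac{1}{q-1} \frac{\partial \log C_q(r)}{\partial \log r}    (10)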

The generalised dimension corresponds to the correlation dimension
for q = 2. The values of Cq and Dq can be related to a combination
of correlation functions for q > 2, with contributions from
the two-point to the q-point correlation functions for Cq. A multi-fractal
structure, unlike a mono-fractal, can only be described by the
full spectrum of Dq. If the fractal in question is a mono-fractal then
we have Dq = D2 for all q and at all scales.
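
The definitions of Cq and Dq can be turned into a numerical estimate in a few lines. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the logarithmic derivative is taken by finite differences with `numpy.gradient`, and the check uses an idealised mono-fractal in which every centre sees exactly n(r) = A r^D neighbours.

```python
import numpy as np

def generalised_integral(counts, n_total, q):
    """C_q(r) = (1 / (N M)) * sum_i n_i(r)**(q - 1) for M centres (eq. 9)."""
    counts = np.asarray(counts, dtype=float)
    return np.sum(counts ** (q - 1)) / (n_total * len(counts))

def generalised_dimension(log_r, log_cq, q):
    """D_q(r) = (1 / (q - 1)) * d log C_q / d log r (eq. 10), for q != 1."""
    return np.gradient(log_cq, log_r) / (q - 1)

# Idealised check: all 50 centres see exactly n(r) = 2 * r**D neighbours.
D_true, q = 3.0, 4
r = np.logspace(0.0, 1.0, 20)
n_total = 10_000                      # hypothetical sample size
log_cq = np.array([
    np.log(generalised_integral(np.full(50, 2.0 * ri ** D_true), n_total, q))
    for ri in r
])
d_q = generalised_dimension(np.log(r), log_cq, q)
print(d_q)
```

In this idealised case the estimate returns D at every scale and for every q, which is the mono-fractal behaviour described above; for real samples the counts fluctuate from centre to centre and Dq acquires the corrections derived in the following subsections.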

By construction, the positive values of q give more weightage
to regions with a high number density whereas the negative values
of q give more weightage to under-dense regions. Thus we may interpret
Dq for q ≫ 0 as characterising the scaling behaviour of the
galaxy distribution in the high density regions like clusters, whereas
q ≪ 0 characterises the scaling in voids. In the situation where the
galaxy distribution is homogeneous and isotropic on large scales,
we intuitively expect Dq ≃ D = 3 independent of the value of q
at the relevant scales.



2.1 Homogeneous Distribution 

In our analysis, we first compute the expected values for Cq and Dq
for a homogeneous distribution in a finite volume. The volume Vtot
over which the points are distributed is taken to be much larger than
the volume of the spheres (V). The points are distributed randomly and
we can use the Binomial distribution. The probability of finding n
points in a sphere of volume V centred on a point, if Vtot contains
N particles, is:

    P(n; r, N) = \binom{N-1}{n-1} p^{n-1} (1-p)^{N-n}    (11)



where p is the probability that a given point (out of N) is located
in a randomly placed sphere. The probability of finding only one
particle in such a sphere is not equal to p, in general. If we place
a sphere of volume V inside a distribution which is contained in
volume Vtot then p = V/Vtot. We shall assume in our calculations
that p ≪ 1. The above expression follows because, with the cell centred
on one point, this point is already in the cell and we need to compute
the probability of n − 1 points out of N − 1 being in the cell.
For comparison, the probability of finding n particles in a randomly
placed sphere of volume V is:

    P(n) = \binom{N}{n} p^{n} (1-p)^{N-n}    (12)

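The average number of points in a randomly placed sphere is
N̄ = Np and we assume that this is much larger than unity. Thus
we work in the situation where 1 ≪ Np ≪ N. Moments of the
distribution for cells centred at points can be related to moments
for randomly placed cells:

    \langle n^m \rangle_p = \sum_{n=1}^{N} n^m \binom{N-1}{n-1} p^{n-1} (1-p)^{N-n}
        = \sum_{n=1}^{N} (n-1+1)^m \binom{N-1}{n-1} p^{n-1} (1-p)^{N-n}
        \simeq \sum_{n=1}^{N} (n-1)^m \binom{N-1}{n-1} p^{n-1} (1-p)^{N-n}
        + \sum_{n=1}^{N} m (n-1)^{m-1} \binom{N-1}{n-1} p^{n-1} (1-p)^{N-n}    (13)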



The subscript p on the angle brackets denotes that the average is for 
cells centred on points within the distribution. A specific applica- 
tion of the above expression is to compute the average number of 
points in a spherical cell. The average number of points in a sphere 
centred at a point is 1 + (N − 1)p ≃ Np + 1 = N̄ + 1. The difference
between the two expressions arises due to fluctuations that
are present in an uncorrelated distribution of points. 
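
The two binomial probabilities and the resulting mean counts can be checked by direct summation. The sketch below is an illustration only; the values N = 200 and p = 0.05 and the function names are assumptions chosen for this example. It takes the point-centred probability to be the binomial probability of finding n − 1 of the remaining N − 1 points in the cell, as described above.

```python
from math import comb

def p_point_centred(n, N, p):
    """Probability of n points in a cell centred on a point (eq. 11)."""
    if n < 1:
        return 0.0          # the central point is always in the cell
    return comb(N - 1, n - 1) * p ** (n - 1) * (1 - p) ** (N - n)

def p_random_cell(n, N, p):
    """Probability of n points in a randomly placed cell (eq. 12)."""
    return comb(N, n) * p ** n * (1 - p) ** (N - n)

def moment(prob, m, N, p):
    """m-th moment of the count distribution, by direct summation."""
    return sum(n ** m * prob(n, N, p) for n in range(N + 1))

N, p = 200, 0.05                                # hypothetical values
mean_point = moment(p_point_centred, 1, N, p)   # 1 + (N - 1) p
mean_random = moment(p_random_cell, 1, N, p)    # N p
print(mean_point, mean_random)
```

The same `moment` helper can be used with larger m to compare the exact point-centred moments with the leading terms of the expansion in equation (13).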

The generalised correlation integral can now be expressed in 
terms of the moments of this probability distribution. In the limit
1 ≪ Np ≪ N we can write down a leading order expression for
the generalised correlation integral for q > 1 as:

    N C_q(r) \simeq \bar{N}^{q-1} + \frac{(q-1)(q-2)}{2} \bar{N}^{q-2} + (q-1) \bar{N}^{q-2} + \cdots    (14)



Here we have ignored terms that are of lower order in N̄ and terms
of the same order in N̄ with powers of p multiplying it. (See Appendix
A for a detailed discussion on how we arrived at this expression.)
The Minkowski-Bouligand dimension corresponding to this
is:

    D_q(r) \simeq D - \frac{(q-2) D}{2 \bar{N}} - \frac{D}{\bar{N}} = D \left( 1 - \frac{q}{2 \bar{N}} \right)    (15)

to the same order. The last two terms in the intermediate expression
for Dq(r) have a different origin: the first of the two terms
arises due to fluctuations present in a random distribution and the 
second term arises due to the cells being centred at points within 
the distribution and this leads to weak clustering. A few points of 
significance are: 

• We do not expect Dq(r) to coincide with D even if the distribution
of points is homogeneous. Thus the benchmark for a sample
of points is not D but Dq(r) given above, and if the Minkowski-
Bouligand dimension for a distribution of points coincides with
Dq(r) then it may be considered as a homogeneous distribution
of points.

• The correction due to a finite size sample always leads to a
smaller value for Dq(r) than D.

• The correction is small if N̄ ≫ 1, as expected. The correction
arises primarily due to discreteness and has been discussed by Borgani
(1995). The major advantage of our approach is that we are
able to derive an expression for the correction.
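
To give a feel for the size of the finite-number correction, the short Python sketch below evaluates the leading order form Dq ≃ D − (q − 2)D/(2N̄) − D/N̄, which follows from taking the logarithmic derivative of the leading order expression for N Cq with N̄ ∝ r^D. This form, the function name and the sample values of N̄ are assumptions of the example; it is meant as a rough guide, valid only for 1 ≪ N̄ ≪ N.

```python
def dq_homogeneous(q, nbar, dim=3.0):
    """Leading-order D_q for a homogeneous, finite point set.

    Assumes D_q ~= D - (q - 2) D / (2 nbar) - D / nbar, with 1 << nbar << N.
    """
    fluctuation_term = (q - 2) * dim / (2.0 * nbar)  # random fluctuations
    centring_term = dim / nbar                       # cells centred on points
    return dim - fluctuation_term - centring_term

for nbar in (10, 100, 1000):
    print(nbar, dq_homogeneous(q=2, nbar=nbar), dq_homogeneous(q=6, nbar=nbar))
```

With these assumptions, a mean count of N̄ = 100 in three dimensions lowers D2 by about one per cent and D6 by about three per cent relative to D = 3.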



2.2 Weakly Clustered Distribution 

We now consider weakly clustered distributions of points. In this
case the counts for spheres whose centres are randomly placed and
for those whose centres are placed on the points in the distribution differ
by a significant amount. Also there is no simple way of relating the
two and hence we cannot use the approach we followed in the previous
subsection for estimating the generalised correlation integral.

In order to make further progress, we note that we can always
define an average density for a distribution of a finite number of
points in a finite volume. This allows us to go a step further and also
define n-point correlation functions. It is well known that this can
be used to relate the generalised correlation integral with n-point
correlation functions, e.g., see Borgani (1995). We shall show below
that it is possible to simplify this relation considerably in the
limit of weak clustering. We can show that the correlation integral
may be written as follows (see Appendix B for details):



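    \frac{N C_q(r)}{\bar{N}^{q-1}} = 1 + \frac{(q-1)(q-2)}{2 \bar{N}} + \frac{q(q-1)}{2} \bar{\xi}
        + O(\bar{\xi}^2) + O\left( \frac{\bar{\xi}}{\bar{N}} \right) + O\left( \frac{1}{\bar{N}^2} \right)    (16)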



Here we have used the assumption that |ξ̄| ≪ 1 and that higher
powers of ξ̄ as well as higher order correlation functions can be ignored
when compared to terms of order ξ̄ and 1/N̄. This assumption
is over and above the limit 1 ≪ Np ≪ N. The first two terms
on the right hand side of this equation are the same as the first two
terms in the expression for Cq that we derived for a homogeneous
distribution of points. The third term encapsulates the contribution
of clustering. This differs from the last term in the corresponding
expression for a homogeneous distribution as in that case the "clustering"
is only due to cells being centred at points whereas in this
case the locations of every pair of points have a weak correlation.
It is worth noting that the highest order term of order O(ξ̄²) has
a factor O (g^) and hence can become important for sufficiently 
large q. This may be quantified by stating that <C 1 is the more 
relevant small parameter for this expansion. 

The Minkowski-Bouligand dimension for such a system can 
now be expressed in the form 



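    D_q(r) \simeq D - \frac{D(q-2)}{2 \bar{N}} + \frac{q}{2} \frac{\partial \bar{\xi}}{\partial \log r}
           = D - \frac{D(q-2)}{2 \bar{N}} - \frac{D q}{2} \left( \bar{\xi}(r) - \xi(r) \right)
           = D - (\Delta D_q)_{\bar{N}} - (\Delta D_q)_{\rm clus}    (17)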



It is interesting to see that the departure of Dq from D due to a finite 
sample and weak clustering is given by distinct terms at the leading 
order. This expression allows us to compute Dq for a distribution 
of points if the number density and ξ are known.
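
As a rough illustration of how such an evaluation might proceed, the sketch below assumes the leading order form Dq ≃ D − D(q − 2)/(2N̄) + (q/2) ∂ξ̄/∂ log r, obtained by differentiating the logarithm of the expansion of N Cq in the weak clustering limit. The power law used for ξ̄, its amplitude, the range of scales and the function name are all hypothetical and are chosen only to exercise the formula; they are not the model used in this paper.

```python
import numpy as np

def dq_weak_clustering(q, r, xibar, nbar, dim=3.0):
    """Leading-order D_q for a weakly clustered, finite point set.

    Assumes D_q ~= D - D (q - 2) / (2 nbar) + (q / 2) d(xibar) / d(log r).
    Returns D_q together with the finite-number and clustering corrections.
    """
    log_r = np.log(r)
    finite_term = dim * (q - 2) / (2.0 * nbar)
    clustering_term = -0.5 * q * np.gradient(xibar, log_r)
    return dim - finite_term - clustering_term, finite_term, clustering_term

r = np.logspace(1.5, 2.3, 200)                # hypothetical scales in h^-1 Mpc
xibar = 0.05 * (r / 50.0) ** -1.8             # hypothetical power law for xibar
nbar = 0.02 * (4.0 / 3.0) * np.pi * r ** 3    # mean count for 0.02 h^3 Mpc^-3
dq, d_fin, d_clus = dq_weak_clustering(4, r, xibar, nbar)
print(dq[0], dq[-1])
```

For a monotonically declining ξ̄ both corrections are positive, so Dq lies below D at every scale and approaches it as ξ̄ and 1/N̄ become small.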

Recall that D is the mathematically defined fractal dimension 
for an infinite set of points with a homogeneous distribution. We 
have already noted some aspects of the correction due to a finite 
number of points in the previous section; here we would like to
highlight aspects of corrections due to clustering.

Figure 1. Our model is compared with the observed fractal dimensions
for a random distribution of points in the special case of the multinomial
model. ΔDq is shown as a function of ⟨N⟩ = N̄ for q = 2 and 6 for this
distribution. ΔDq measured from a realisation are plotted as points, and
our model is shown as a curve.



• For hierarchical clustering, both terms have the same sign and 
lead to a smaller value for Dq as compared to D. 

• Unless the correlation function has a feature at some scale, 
smaller correlation corresponds to a smaller correction to the 
Minkowski-Bouligand dimension. The expression given above 
quantifies this intuitive expectation. 

• Note that for q = 2, the expression given here is exact. For
this case, the contribution of clustering has also been discussed by
Martinez et al. (1998).

• If the correlation function has a feature then it is possible to
have a small correction term (ΔDq)clus for a relatively large ξ.
The relation between ξ and (ΔDq)clus is no longer one to one.



2.3 Multifractal Multinomial Distribution 

We have applied our method to the multinomial multi-fractal model
discussed in the literature (see, e.g., Martinez & Saar (2002)). The set
of points for this model can be generated by starting with a square
and dividing it into four parts. We assign a probability {fi} to each
of these sub-squares (Σ fi = 1). This construction can be continued
iteratively by dividing each smaller square further and assigning
probability by multiplying the corresponding number fi by
all its ancestors. We performed this construction to L = 8 levels,
thus getting a 256² lattice with the measure associated with each
pixel. For such models we have an analytical expression for the
generalised dimension:

    D_q = \frac{1}{1-q} \log_2 \left( \sum_{i=1,\, f_i \neq 0}^{4} f_i^{q} \right)    (18)

This expression can be used to check whether our model for finite
number and correlation works correctly or not.
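
The construction and the analytical dimension can be sketched in a few lines of Python. The code below assumes the standard multinomial result Dq = log2(Σ fi^q)/(1 − q) and builds the lattice measure with a repeated Kronecker product; the function names and the arrangement of the four fi within each 2 × 2 block are choices made for this illustration.

```python
import numpy as np

def multinomial_dq(f, q):
    """Analytic D_q = log2(sum_i f_i**q) / (1 - q) for q != 1, skipping f_i = 0."""
    f = np.asarray(f, dtype=float)
    f = f[f > 0]
    return np.log2(np.sum(f ** q)) / (1.0 - q)

def multinomial_measure(f, levels):
    """Measure on a 2**levels x 2**levels lattice built by repeated subdivision."""
    block = np.asarray(f, dtype=float).reshape(2, 2)
    measure = np.ones((1, 1))
    for _ in range(levels):
        # Each cell is split in four; a child's weight is parent * f_i.
        measure = np.kron(measure, block)
    return measure

f = [0.23, 0.27, 0.25, 0.25]
mu = multinomial_measure(f, levels=8)            # 256 x 256 lattice
dq_analytic = multinomial_dq(f, q=6)
# Pixel-scale estimate: D_q = log(sum mu**q) / ((q - 1) log eps), eps = 2**-8.
dq_lattice = np.log(np.sum(mu ** 6)) / (5 * np.log(2.0 ** -8))
print(dq_analytic, dq_lattice)
```

For fi = 0.23, 0.27, 0.25, 0.25 and q = 6 the analytic value is close to 1.986, and for four equal fi it is exactly 2. Points can then be drawn from the lattice with probability proportional to the measure in each pixel.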
Figure 2. Our model is compared with the observed fractal dimensions
for a multinomial fractal with fi = 0.23, 0.27, 0.25, 0.25. ΔDq, the
difference between Dq and Dq,exp, is shown as a function of ⟨N⟩ for q = 6. We have plotted
ΔDq measured in the five realisations as points with error bars. The error
bars mark the extreme values of ΔDq seen in these realisations whereas
the central point marks the average value. Predictions of our model based
on the correlation function measured in these realisations are shown as a thick
line. This line corresponds to the average value of ξ and ξ̄ measured in simulations,
and thin lines mark the predictions of our model based on extreme
values seen in these simulations.




Figure 3. The linearly extrapolated two point correlation function is shown
as a function of scale r (h⁻¹ Mpc) for the best fit model for WMAP-3 (see text for details).
This has been used for the calculation of (ΔDq)clus.

Figure 4. Estimated deviation of the Minkowski-Bouligand dimension from
the physical dimension is shown here for two types of populations. In black
we have plotted ΔDq for an unbiased sample of points, distributed in redshift
space with the real space correlation function as shown in Figure 3.
The solid curve shows ΔD2, whereas ΔD4 and ΔD6 are shown with a
dashed curve and a dot-dashed curve respectively. Curves in red correspond
to an LRG like population with a number density of 5 X 10~^ h^^Mpc^ 
and a linear bias of 2. 




Figure 5. This figure shows the components of ΔDq for an LRG like population
of galaxies for q = 4. This value of q was chosen as the contribution
of a finite number of galaxies does not vanish in this case. The solid line
shows ΔD4, the dashed line shows the contribution of clustering to ΔDq
and the dot-dashed line is the correction due to a finite number of galaxies.
Clearly, the correction due to clustering is the dominant reason for the departure
of Dq from D.



We have calculated the generalised dimension for this model
taking three different combinations of fi. In one of the cases all four
fi's are 0.25 so that the distribution is homogeneous. In this case
the expected Dq = 2 for all q using the above expression. Our
model in this case gives a scale dependent correction to this due to
a finite number of particles. Figure 1 shows ΔDq as a function of
⟨N⟩ = N̄ for q = 2 and 6 for this distribution. ΔDq measured
from a realisation are plotted as points, and our model is shown as
a curve. It is clear that for A'^ < 10"^, there is a visible deviation of 
Dq from the expected value and that our model correctly estimates 
this deviation. 

In the other case we present here, the fi's are close to 0.25 but not
exactly equal to 0.25, thus giving us a slightly clustered distribution.
We use fi = 0.23, 0.27, 0.25, 0.25. In this case, the expected Dq ≃ 1.986 for
q = 6. As this differs from D = 2, the difference in our model must
come from the clustering present in this fractal. We generated five realisations
of this fractal. Figure 2 shows ΔDq, the difference between Dq and Dq,exp, as
a function of ⟨N⟩ for q = 6, where Dq,exp follows from Eqn. (18).
We have plotted ΔDq measured in the five realisations as points
with error bars. The error bars mark the extreme values of ΔDq
seen in these realisations whereas the central point marks the average
value. Predictions of our model (Eqn. (17)) based on the correlation
function measured in these realisations are shown as a thick line.
This line corresponds to the average value of ξ and ξ̄ measured
in simulations, and thin lines mark the predictions of our model
based on extreme values seen in these simulations. At ⟨N⟩ ≪ 100,
where the effect of a finite number is dominant, our model matches
the measured ΔDq very well. At ⟨N⟩ ≫ 100, where the effect
of clustering is dominant, we again find a good match between
the model and measured values. It is significant that at very large
⟨N⟩, we model the deviation of Dq from D = 2 correctly. However,
there appears to be a mismatch in the transition region around
⟨N⟩ ~ 100. On inspection, we find that ξ̄ − ξ has an oscillatory
behaviour up to this scale and the discrepancy corresponds to the
last oscillation. At the scale of maximum discrepancy, ξ̄ − ξ ≃ 0.05
and perhaps we cannot ignore values of this order.

In summary we can say that our model works very well for the 
multinomial model and we find that the correction due to cluster- 
ing as well as a finite number of points matches with the observed 
behaviour of Dq. 



3 DISCUSSION 

The expressions derived in the previous section have a rich struc- 
ture and we illustrate some of the features here. We would also like 
to discuss the application to the concordance model here. The two 
point correlation function for the model that best fits the WMAP-3
data (Spergel et al. 2007) is shown in Figure 3. We have used the
flat ΛCDM model with a power law initial power spectrum that best
fits the WMAP-3 data here. Parameters of the model used here are:
H0 = 73 km/s/Mpc, Ωb h² = 0.0223, Ωc h² = 0.105, ns = 0.96
and τ = 0.088. For this model, σ8 = 0.76. The two point correlation
has been shown at large scales where the clustering can be
assumed to be weak. The most prominent feature here is the peak 
near 100 Mpc. This peak is caused by baryon acoustic oscillations 
(BAO) prior to decoupling; see, e.g., Eisenstein & Hu (1998). Apart 
from this peak, the two point correlation function declines from 
small scales towards larger scales at length scales shown here. 

All observations of galaxies are carried out in redshift space. 
Therefore we must use the correlation function in redshift space. At
large scales, redshift space distortions caused by infall lead to an
enhancement of the two point correlation function. The enhance- 
ment is mainly along the line of sight but the angle averaged two 
point correlation function is also amplified by some amount (Kaiser 
1987). 

Further, we must also take into account the bias in the distribution
of galaxies while using the correlation function shown in
Figure 3. This has been discussed by many authors (Kaiser 1984;
Bardeen et al. 1986; Brainerd & Villumsen 1994; Fry 1996; Mo &
White 1996; Bagla 1998a,b; Dekel & Lahav 1999). At large scales,
we may assume that the linear bias factor $b$ is sufficient for describing
the redshift space distortions and clustering.

Lastly, we should mention that we are working with the linearly
extrapolated correlation function at these scales even though
there is some evidence that perturbative effects lead to a slight shift
in the location of the peak in $\xi$ (see, for example, Smith et al.
2007). The only change caused by such a shift in the location of
the peak is a corresponding shift in the scale where there appears to
be a transition from large values of $\Delta D_q$ towards small and constant values.
As the shift does not alter our key conclusions, we will ignore such
effects in the following discussion.

We plot the expected departure of $D_q$ from $D$ for an unbiased
sample of galaxies in Figure 4. We assumed that typical ($L_*$)
galaxies have an average number density of $0.02\,h^3\,\mathrm{Mpc}^{-3}$ and
a bias factor of unity. $\Delta D_q$ for such a population is shown as a
function of scale by black curves for $q = 2$, 4 and 6. Red curves
show the same quantity for a sample of galaxies similar to Luminous
Red Galaxies (LRGs). We used a bias factor $b = 2$ and
a number density of $5 \times 10^{-5}\,h^3\,\mathrm{Mpc}^{-3}$ that is representative of
such a population; see, for example, Percival et al. (2007a). $\Delta D_q$
is negative at all scales shown here, as expected from the expression
for hierarchical clustering (see Eqn.(17)). The behaviour of
$\Delta D_q$ as a function of scale has two distinct regimes on either side
of $100\,h^{-1}\,\mathrm{Mpc}$. The magnitude of $\Delta D_q$ increases rapidly as we
go from $100\,h^{-1}\,\mathrm{Mpc}$ towards smaller scales. At scales larger than
$100\,h^{-1}\,\mathrm{Mpc}$, $\Delta D_q$ either stays constant or decreases at a very slow
rate. The behaviour of $\Delta D_q$ around $100\,h^{-1}\,\mathrm{Mpc}$ is dictated largely
by the BAO peak in $\xi$ at this scale. Although there is no peak in $\bar{\xi}$,
$\partial\bar{\xi}/\partial \log r = -D(\bar{\xi}(r) - \xi(r))$ has a minimum and a maximum
near the scale of the peak in $\xi(r)$. This results in a corresponding
minimum and maximum for $\Delta D_q$, as the contribution of a finite number
of galaxies is subdominant at such large scales. We illustrate this in
Figure 5, where $\Delta D_4$ is plotted for an LRG like sample and the
two contributions (from a finite sample and weak clustering) are
also shown.
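
The size of the finite-sample contribution at these scales can be estimated with a few lines of code. The sketch below is only an illustration and is not part of the original analysis: it assumes $D = 3$, takes the finite-number term $-D(q-2)/(2\bar{N})$ from Eqn.(A7), and uses the number density quoted above for typical galaxies. The function names `mean_count` and `finite_number_offset` are hypothetical.

```python
import math

def mean_count(number_density, radius):
    """Mean number of points N-bar in a sphere of the given radius (D = 3 assumed)."""
    return number_density * (4.0 / 3.0) * math.pi * radius ** 3

def finite_number_offset(q, n_bar, dim=3):
    """Finite-sample part of Delta D_q for a homogeneous set: -D (q - 2) / (2 N-bar)."""
    return -dim * (q - 2) / (2.0 * n_bar)

# Typical (L*) galaxies: 0.02 h^3 Mpc^-3, spheres of radius 100 h^-1 Mpc.
n_bar = mean_count(0.02, 100.0)
for q in (2, 4, 6):
    print(q, finite_number_offset(q, n_bar))
```

With tens of thousands of galaxies per sphere, this term is orders of magnitude smaller than the offsets discussed in the text, which is why clustering dominates at these scales.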

If $\xi$ has a power law form then there are no extrema for
$\partial\bar{\xi}/\partial \log r$ and the magnitude of both $\xi$ and $\Delta D_q$ becomes
progressively smaller as we get to larger scales. There is a one to one
relation between $\xi$ and $\Delta D_q$ for a given model of this type. However,
a feature like the peak introduced by BAO leads to the non-trivial
behaviour illustrated in Figure 4. Here we find that $|\Delta D_q|$ can
be smaller at scales with a larger $\xi$. For example, the scale with the
local maximum of $\xi$ is very close to the scale with the local minimum of
$|\Delta D_q|$. The intuitive correspondence of a small $\xi$ implying a smaller
deviation of $D_q$ from $D$ does not apply in this case.
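
The power law case can be made explicit with a short worked example. This is an illustrative special case and not a result quoted from the analysis above: it assumes $D = 3$, a correlation function $\xi(r) = (r/r_0)^{-\gamma}$ with $0 < \gamma < 3$, the volume average of Eqn.(B7), and the clustering term of Eqn.(B11) with the finite number term neglected.

```latex
\bar{\xi}(r) = \frac{3}{r^3} \int_0^r x^2 \left( \frac{x}{r_0} \right)^{-\gamma} dx
             = \frac{3}{3-\gamma}\, \xi(r) ,
\qquad
\Delta D_q \simeq -\frac{3q}{2} \left( \bar{\xi}(r) - \xi(r) \right)
           = -\frac{3q}{2}\, \frac{\gamma}{3-\gamma}\, \xi(r) .
```

Under these assumptions $\Delta D_q$ is proportional to $\xi$ with a scale independent coefficient, so the two decline together and no extrema appear.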

The difference between the unbiased galaxy population, and 
an LRG like sample is stark. The LRG like sample has a 
Minkowski-Bouligand dimension that differs from D = 3 by a 
significant amount. The main reason for this difference is the high 
bias factor associated with the LRG population, although a smaller 
number density also makes some difference. Different clustering
properties for different types of galaxies imply that these will
not have the same Minkowski-Bouligand dimension. This has no
impact on determination of the scale of homogeneity for the uni- 
verse, where we must use unbiased tracers. 

The calculations presented in the previous section allow us to
estimate the offset of the Minkowski-Bouligand dimension from
the physical dimension due to weak clustering and a finite sample.
This has to be accompanied by a calculation of the dispersion in the
expected values (Szapudi, Colombi, & Bernardeau 1999; Colombi
et al. 2000b). The natural estimate for the scale of homogeneity is
the scale where the offset of the Minkowski-Bouligand dimension
from the physical dimension becomes smaller than the dispersion
in a sufficiently large survey. Given that the offset is dominated by
the effect of clustering, we have $\Delta D_q \simeq 0.5\,D q\,(\xi - \bar{\xi}) \sim \mathcal{O}(q\bar{\xi})$.
The offset scales with $q$. Further, it is apparent that the dispersion
in $\Delta D_q$ must also scale with $q$. This implies that the requirement
of dispersion being greater than the offset leads to the same scale
for all $q$. This is a very satisfying feature of this approach in that the
scale of homogeneity does not depend on the choice of $q$ as long as
the effect of a finite number of points is subdominant.

Alternatively, we may argue that the scale of homogeneity
should be identified with the scale above which the variation of
$\Delta D_q$ is very small. While this is an acceptable prescription for
typical galaxies, where $|\Delta D_q| < 0.06$ at scales above $100\,h^{-1}\,\mathrm{Mpc}$,
it does not appear reasonable for an LRG like population. The scale
of homogeneity for the latter population is clearly much larger than
$100\,h^{-1}\,\mathrm{Mpc}$.



4 CONCLUSIONS 

We have studied the problem of the expected value of the
Minkowski-Bouligand dimension for a finite distribution of points.
For this purpose, we have studied a homogeneous distribution as
well as a weakly clustered distribution. In our study, $q/\bar{N}$ and
$q|\bar{\xi}|$ are taken to be the small parameters and the deviation of $D_q$ from
$D$ is estimated in terms of these quantities. In both cases we find
that the expected values of the Minkowski-Bouligand dimension
$D_q$ are different from $D$ for the distribution of points. For generic
distributions, the value of $D_q$ is less than the dimension $D$. We
have derived an expression for $D_q$ in terms of the correlation function
and the number density in the limit of weak clustering. It is
remarkable that $D_q < D$ even for homogeneous distributions.

We find that $\Delta D_q = D_q - D$ is non-zero at all scales for
unbiased tracers of mass in the concordance model in cosmology.
For this model $\Delta D_q$ is a large negative number at small scales but
it rapidly approaches zero at larger scales. $\Delta D_q$ is a very slowly
varying function of scale above $100\,h^{-1}\,\mathrm{Mpc}$ and hence this may
be tentatively identified as the scale of homogeneity for this model.
A more quantitative approach requires us to estimate not only the
systematic offset $\Delta D_q$ but also the dispersion in this quantity. The scale
of homogeneity can then be identified as the scale where the offset
is smaller than the expected dispersion. We plan to undertake the
estimation of the dispersion as the next step. Verifying these results using
simulated distributions of points is also on the agenda.

Although we have used the example of galaxy clustering for 
illustrating our calculations, the results as given in Eqn.(15) and 
Eqn.(17) are completely general and apply to any distribution of 
points with weak departures from homogeneity. A detailed derivation
of the relations presented here, with verification using mock
distributions of points, will be presented in a separate publication,
where we also expect to highlight other applications.






ACKNOWLEDGEMENTS 

The authors would like to thank Prof. K. Subramanian for useful
comments. JY is supported by a fellowship of the Council of Scientific
and Industrial Research (CSIR), INDIA. TRS thanks the Department
of Science and Technology, INDIA for financial assistance.
JY and TRS acknowledge the facilities at the IUCAA Reference
Centre at Delhi University. Computational work for this study was
carried out at the cluster computing facility in the Harish-Chandra
Research Institute (http://cluster.hri.res.in). We would like to thank
the anonymous referee for useful comments.



REFERENCES 

Amendola L., Palladino E., 1999, ApJ, 514, L1
Bagla, J. S. 1998a, MNRAS, 297, 251
Bagla, J. S. 1998b, MNRAS, 299, 417
Bardeen J. M., Bond J. R., Kaiser N., Szalay A. S., 1986, ApJ, 304, 15
Baryshev Y. V., Bukhmastova Y. L., 2004, AstL, 30, 444
Benoist, C., Maurogordato, S., da Costa, L. N., Cappi, A., & Schaeffer, R. 1996, ApJ, 472, 452
Bernardeau, F., Colombi, S., Gaztañaga, E., & Scoccimarro, R. 2002, Phys Reps, 367, 1
Best J. S., 2000, ApJ, 541, 519
Bharadwaj S., Gupta A. K., Seshadri T. R., 1999, A&A, 351, 405
Borgani S., Murante G., Provenzale A., Valdarnini R., 1993, PhRvE, 47, 3879
Borgani S., Murante G., 1994, PhRvE, 49, 4907
Borgani S., 1995, Phys Reps, 251, 1
Brainerd T. G., Villumsen J. V., 1994, ApJ, 431, 477
Cappi A., Benoist C., da Costa L. N., Maurogordato S., 1998, A&A, 335, 779
Cole, S., et al. 2005, MNRAS, 362, 505
Cole, S., Sanchez, A. G., & Wilkins, S. 2006, ArXiv Astrophysics e-prints, arXiv:astro-ph/0611178
Coleman P. H., Pietronero L., 1992, PhR, 213, 311
Colless, M., et al. 2001, MNRAS, 328, 1039
Colombi, S., Pogosyan, D., & Souradeep, T. 2000a, Physical Review Letters, 85, 5515
Colombi S., Szapudi I., Jenkins A., Colberg J., 2000b, MNRAS, 313, 711
Dekel A., Lahav O., 1999, ApJ, 520, 24
Dressler, A. 1980, ApJ, 236, 351

Einstein, A. 1917, Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften (Berlin), Seite 142-152, 142
Eisenstein, D. J., & Hu, W. 1998, ApJ, 496, 605
Fry J. N., 1996, ApJ, 461, L65
Guzzo L., 1997, NewA, 2, 517
Hatton S., 1999, MNRAS, 310, 1128
Hawkins E., et al., 2003, MNRAS, 346, 78
Hogg D. W., Eisenstein D. J., Blanton M. R., Bahcall N. A., Brinkmann J., Gunn J. E., Schneider D. P., 2005, ApJ, 624, 54
Jones B. J., Martinez V. J., Saar E., Trimble V., 2005, RvMP, 76, 1211
Kaiser N., 1984, ApJ, 284, L9
Kaiser, N. 1987, MNRAS, 227, 1
Kim, R. S. J., et al. 2002, AJ, 123, 20
Kulkarni, G. V., Nichol, R. C., Sheth, R. K., Seo, H.-J., Eisenstein, D. J., & Gray, A. 2007, MNRAS, 378, 1196
Kurokawa T., Morikawa M., Mouri H., 2001, A&A, 370, 358
Lahav O., 2002, CQGra, 19, 3517
de Lapparent, V., Geller, M. J., & Huchra, J. P. 1986, ApJ Letters, 302, L1
de Lapparent, V., & Slezak, E. 2007, A&A, 472, 29
Mandelbrot, B. B. 1982, The Fractal Geometry of Nature, San Francisco: Freeman
Martinez V. J., 1999, Sci, 284, 445
Martinez V. J., Pons-Borderia M.-J., Moyeed R. A., Graham M. J., 1998, MNRAS, 298, 1212
Martinez, V. J., Lopez-Marti, B., & Pons-Borderia, M.-J. 2001, ApJ Letters, 554, L5
Martinez, V. J., & Saar, E. 2002, Statistics of the Galaxy Distribution, Chapman & Hall/CRC, Boca Raton, ISBN: 1584880848

Mo H. J., White S. D. M., 1996, MNRAS, 282, 347
Padmanabhan, T. 2002, Theoretical Astrophysics, pp. 638, Cambridge University Press, Cambridge, UK, ISBN 0521562422
Pan J., Coles P., 2000, MNRAS, 318, L51
Peacock, J. A. 1999, Cosmological Physics, pp. 704, Cambridge University Press, Cambridge, UK, ISBN 052141072X
Peebles P. J. E., 1980, Large Scale Structure in the Universe, Princeton University Press, Princeton, USA
Percival, W. J., Cole, S., Eisenstein, D. J., Nichol, R. C., Peacock, J. A., Pope, A. C., & Szalay, A. S. 2007a, MNRAS, 381, 1053
Percival, W. J., et al. 2007b, ApJ, 657, 645
Pietronero L., 1987, Physica A, 144, 257
Ross, N. P., et al. 2006, ArXiv Astrophysics e-prints, arXiv:astro-ph/0612400
Sanchez, A. G., & Cole, S. 2007, ArXiv e-prints, 708, arXiv:0708.1517
Shectman, S. A., Landy, S. D., Oemler, A., Tucker, D. L., Lin, H., Kirshner, R. P., & Schechter, P. L. 1996, ApJ, 470, 172
Smith, R. E., Scoccimarro, R., & Sheth, R. K. 2007, ArXiv Astrophysics e-prints, arXiv:astro-ph/0703620
Spergel, D. N., et al. 2007, ApJ Supplement Series, 170, 377
Sylos Labini, F., Montuori, M., Pietronero, L. 1998, Phys Reps, 293, 61
Sylos Labini, F., Vasilyev, N. L., & Baryshev, Y. V. 2007, A&A, 465, 23
Szapudi I., Colombi S., Bernardeau F., 1999, MNRAS, 310, 428
van de Weygaert, R., & Schaap, W. 2007, ArXiv e-prints, 708, arXiv:0708.1441
Vasilyev, N. L., Baryshev, Y. V., & Sylos Labini, F. 2006, A&A, 447, 431
Wu K., Lahav O., Rees M., 1999, Nature, 397, 225
Yadav J., Bharadwaj S., Pandey B., Seshadri T. R., 2005, MNRAS, 364, 601
York, D. G., et al. 2000, AJ, 120, 1579
Zehavi I., et al., 2002, ApJ, 571, 172



APPENDIX A: HOMOGENEOUS DISTRIBUTION 

The probability for a point to be found in a sphere of volume $V$
enclosed within a total volume of $V_{\rm tot}$ is $p = V/V_{\rm tot}$. In an uncorrelated
distribution of points, the probabilities for all the points are
independent of one another and hence the probability of $n$ out of
$N$ points falling in a sphere of volume $V$ is:

$$P(n, N) = \binom{N}{n} p^n (1-p)^{N-n} \tag{A1}$$

The distribution function determined by the probability function
$P(n, N)$ is called a Binomial distribution. As discussed in the text,
the probability distribution for occupation number of cells centered
at points is different but moments of that distribution can be related
to the moments of the probability distribution given above at the
required level of accuracy. We require that the description should
be accurate to first order in $1/\bar{N}$.

The moment generating function for the Binomial distribution
is given by

$$G(t) = \sum_{n=0}^{N} e^{tn} \binom{N}{n} p^n (1-p)^{N-n} = \left( p e^t + 1 - p \right)^N \tag{A2}$$



The $m$th moment of the distribution can then be calculated by
differentiating $G$ with respect to $t$, doing this $m$ times and then setting
$t$ to zero. The $m$th derivative of $G(t)$ at $t = 0$ can be written as:

$$G^{(m)}(t)\big|_{t=0} = \sum_{l=1}^{m} H_{m,l}\, p^l\, \frac{N!}{(N-l)!} \tag{A3}$$

where $H$ satisfies the following recurrence relation

$$H_{m,l} = l\, H_{m-1,l} + H_{m-1,l-1} \tag{A4}$$

with $H_{1,1} = 1$ and $H_{m,l} = 0$ for $l > m$ and $l < 1$. It can be shown
that this implies $H_{l,l} = 1$ for all $l$.

The $m$th moment of the distribution is given by:

$$\langle \mathcal{N}^m \rangle = G^{(m)}(t)\big|_{t=0} = \sum_{l=1}^{m} H_{m,l}\, p^l\, \frac{N!}{(N-l)!}$$

On the face of it this expression has a large number of terms for
$m \gg 1$ and is difficult to analyse. But if we assume that $p \ll 1$ and
$\bar{N} = Np \gg 1$ then we can rewrite the expression in the following
form:

$$\langle \mathcal{N}^m \rangle \simeq p^m \frac{N!}{(N-m)!} + H_{m,m-1}\, p^{m-1} \frac{N!}{(N-m+1)!} = \bar{N}^m + H_{m,m-1}\, \bar{N}^{m-1} + \mathcal{O}(p \bar{N}^{m-1}) \tag{A5}$$

where we have retained terms up to $\mathcal{O}(\bar{N}^{m-1})$ and have dropped
all other terms. We have also used the recurrence relation and find
that $H_{m,m-1} = m(m-1)/2$.

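We can now write the correlation integral as:

$$N C_q(r) = \langle \mathcal{N}^{q-1} \rangle \simeq \bar{N}^{q-1} \left[ 1 + \frac{(q-1)(q-2)}{2\bar{N}} \right] + \cdots \tag{A6}$$

The Minkowski-Bouligand dimension is then given by

$$D_q(r) = \frac{1}{q-1} \frac{\partial \log C_q(r)}{\partial \log r} = \frac{1}{q-1} \frac{\partial}{\partial \log r} \log \left[ \bar{N}^{q-1} \left( 1 + \frac{(q-1)(q-2)}{2\bar{N}} \right) \right] \simeq D \left[ 1 - \frac{(q-2)}{2\bar{N}} \right] \tag{A7}$$

where $D$ is the dimension of the space in which particles are distributed.
In this calculation, we have again made use of the fact that
$\bar{N} \gg 1$ and that it scales as the $D$th power of scale $r$ for a random
distribution.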
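
The recurrence in Eqn.(A4) and the moment expansion in Eqn.(A3) are easy to check numerically. The following is a minimal sketch under stated assumptions, not the code used for this work: the function names are hypothetical, and exact rational arithmetic is used so that the sum over $H_{m,l}$ can be compared with the moments computed directly from the Binomial distribution of Eqn.(A1). The last function simply evaluates the right hand side of Eqn.(A7).

```python
from fractions import Fraction
from math import comb, perm

def h_table(m_max):
    """H[m][l] from H[m][l] = l*H[m-1][l] + H[m-1][l-1], starting from H[1][1] = 1."""
    H = [[0] * (m_max + 2) for _ in range(m_max + 1)]
    H[1][1] = 1
    for m in range(2, m_max + 1):
        for l in range(1, m + 1):
            H[m][l] = l * H[m - 1][l] + H[m - 1][l - 1]
    return H

def moment_from_recurrence(m, n_points, p, H):
    """m-th moment as sum_l H[m][l] p^l N!/(N-l)!; perm(N, l) equals N!/(N-l)!."""
    return sum(H[m][l] * p**l * perm(n_points, l) for l in range(1, m + 1))

def moment_direct(m, n_points, p):
    """m-th moment summed directly over the Binomial probabilities."""
    return sum(n**m * comb(n_points, n) * p**n * (1 - p)**(n_points - n)
               for n in range(n_points + 1))

def dq_homogeneous(q, n_bar, dim=3):
    """Expected D_q for a homogeneous finite point set: D * (1 - (q - 2) / (2 N-bar))."""
    return dim * (1.0 - (q - 2) / (2.0 * n_bar))

H = h_table(6)
p = Fraction(1, 50)  # illustrative value of V / V_tot
for m in range(1, 7):
    print(m, moment_from_recurrence(m, 40, p, H) == moment_direct(m, 40, p))
```

Using `Fraction` avoids round-off, so the two ways of computing each moment can be compared for exact equality rather than within a tolerance.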



APPENDIX B: WEAKLY CLUSTERED DISTRIBUTION 

In this section we will derive the form of the correlation integral for
a weakly clustered distribution of points. Consider a sphere of volume
$V$ contained within the sample of volume $V_{\rm tot}$. We follow the
approach given in §36 of Peebles (1980) for estimating the correlation
integral. In order to estimate the correlation integral, we divide
the sphere into infinitesimal elements such that each element contains
at most one point. This is a useful construct as $n_i^m = n_i$ for
all $m > 0$, where $n_i$ is the occupancy of the $i$th infinitesimal volume
element. If the occupancy of the $i$th volume element is $n_i$ then
we have:

$$\mathcal{N} = \sum_i n_i \tag{B1}$$

Therefore the mean count is:

$$\bar{N} = \langle \mathcal{N} \rangle = \Big\langle \sum_i n_i \Big\rangle = \bar{n} V \tag{B2}$$

The $m$th moment is then:

$$\langle \mathcal{N}^m \rangle = \Big\langle \Big( \sum_i n_i \Big)^m \Big\rangle \tag{B3}$$

If the sphere is centred at a point in the distribution then the averages
are denoted as $\langle \mathcal{N}^m \rangle_p$; this is what we are interested in for
the purpose of computing the correlation integral:

$$\langle \mathcal{N}^m \rangle_p = \Big\langle \Big( \sum_i n_i \Big)^m \Big\rangle_p \tag{B4}$$



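Averaging the sum raised to a positive integral power will lead to
averaging of terms of type $n_i n_j$, $n_i n_j n_k$, etc. and the expression
for such terms involves $n$-point correlation functions, $n$ being the
number of terms being multiplied. With this insight, we can write

$$\begin{aligned}
\langle \mathcal{N}^m \rangle_p &= \Big\langle \sum n_i^m \Big\rangle_p + m \Big\langle \sum n_i^{m-1} n_j \Big\rangle_p + \cdots + \frac{m(m-1)}{2} \Big\langle \sum n_1^2 n_3 \cdots n_m \Big\rangle_p + \Big\langle \sum n_1 n_2 n_3 \cdots n_m \Big\rangle_p \\
&= \Big\langle \sum n_i \Big\rangle_p + m \Big\langle \sum n_i n_j \Big\rangle_p + \cdots + \frac{m(m-1)}{2} \Big\langle \sum n_1 n_3 \cdots n_m \Big\rangle_p + \Big\langle \sum n_1 n_2 n_3 \cdots n_m \Big\rangle_p
\end{aligned} \tag{B5}$$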



Here terms in the expansion correspond to $i = j = k = \cdots$ for
the first term, only one of the indices differing from the rest for the
second term and so on. The last term in this series is for all the
$m$ indices different. We have shifted the notation in order to write
down the explicit form for arbitrary $m$.

For a weakly clustered set of points with statistical isotropy
and homogeneity, we can safely assume that the magnitude of
the two point correlation function is small compared to unity, and
higher order correlation functions are even smaller. Further, we
continue to use the assumption that $\bar{N} \gg 1$ and hence we need
to retain only terms of the highest and the next highest order in
this parameter. Thus we have two small parameters in the problem,
$|\bar{\xi}|$ and $1/\bar{N}$, and our task is to compute the leading order terms in
$\langle \mathcal{N}^m \rangle_p$. Here $\xi$ is the two point correlation function.

It can be shown that the leading order contribution comes from
the last term in the series in Eqn.(B5), and the next to leading order
contribution is from the last two terms. We should note that
these terms also contain several terms that are smaller than the leading
and next to leading order within them. The foremost contribution
comes from the uncorrelated component of the last term, i.e.,
$\int \bar{n}^m\, dV_1\, dV_2 \cdots dV_m = \bar{N}^m$. The integral here is over $m$
independent volumes and $\bar{n}$ is the average number density. The next
contribution comes from components of this term that include the
effect of pairwise correlations. As there are $m$ distinct points in
addition to the point at the centre of the sphere, the
number of distinct pairs is $m(m+1)/2$ and the term has the form:

$$\frac{m(m+1)}{2} \bar{N}^m \bar{\xi}(r) \tag{B6}$$

where $r$ is the radius of the sphere with volume $V$ and $\bar{\xi}$ is given
by:

$$\bar{\xi}(r) = \frac{3}{r^3} \int_0^r x^2\, \xi(x)\, dx . \tag{B7}$$

It can be shown that all other components of the last term involve
higher order correlation functions, or higher powers of $\xi$. Further,
it can be shown that the contributions that contain only a single
power of $\xi$ from other terms in the series in Eqn.(B5) contain a
lower power of $\bar{N}$. Lastly, it can be shown that the only other term
that we need to take into account comes from the penultimate term
in the series in Eqn.(B5). The uncorrelated component of this term
is:

$$\frac{m(m-1)}{2} \bar{N}^{m-1} \tag{B8}$$

Thus we have for the $m$th moment of the counts of neighbours:

$$\begin{aligned}
\langle \mathcal{N}^m \rangle_p &= \bar{N}^m + \frac{m(m+1)}{2} \bar{N}^m \bar{\xi} + \frac{m(m-1)}{2} \bar{N}^{m-1} + \cdots \\
&= \bar{N}^m \left[ 1 + \frac{m(m+1)}{2} \bar{\xi} + \frac{m(m-1)}{2\bar{N}} + \cdots \right]
\end{aligned} \tag{B9}$$

The largest neglected term, of order $\mathcal{O}(\bar{\xi}^2)$, arises from the contribution of
correlated triangles in the last term of Eqn.(B5). The number of
triangles scales as $m^3$ and hence this term can become important for
sufficiently large $m$. This may be codified by stating that $m|\bar{\xi}| \ll 1$ is
the more relevant small parameter.

The correlation integral can be written as

$$N C_q(r) \simeq \bar{N}^{q-1} \left( 1 + \frac{q(q-1)}{2} \bar{\xi} + \frac{(q-1)(q-2)}{2\bar{N}} \right) \tag{B10}$$

From this we can calculate the Minkowski-Bouligand dimension
using Eqn.(A7) as

$$\begin{aligned}
D_q(r) &= \frac{1}{(q-1)} \frac{\partial \log C_q(r)}{\partial \log r} \\
&\simeq D - \frac{(q-2)}{2\bar{N}} D + \frac{q}{2} \frac{\partial \bar{\xi}}{\partial \log r} \\
&= D - \frac{D(q-2)}{2\bar{N}} - \frac{Dq}{2} \left( \bar{\xi}(r) - \xi(r) \right)
\end{aligned} \tag{B11}$$

This is the required expression.

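
The two contributions in Eqn.(B11) can be evaluated numerically for any assumed correlation function. The sketch below is illustrative only and is not the code used for the figures: it assumes $D = 3$, integrates Eqn.(B7) with a simple midpoint rule, and uses a hypothetical power law $\xi(r) = (r/r_0)^{-\gamma}$ with $r_0 = 5\,h^{-1}\,\mathrm{Mpc}$ and $\gamma = 1.8$ purely as a test input. All function names are assumptions made for this example.

```python
import math

def xi_bar(xi, r, n_steps=20000):
    """Volume average (3 / r^3) * integral_0^r x^2 xi(x) dx using the midpoint rule."""
    dx = r / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * dx  # cell midpoints, so xi is never evaluated at x = 0
        total += x * x * xi(x)
    return 3.0 * total * dx / r**3

def delta_dq(q, r, xi, number_density, dim=3):
    """Return the (finite number, clustering) parts of Delta D_q at scale r."""
    n_bar = number_density * (4.0 / 3.0) * math.pi * r**3
    finite = -dim * (q - 2) / (2.0 * n_bar)
    clustering = -0.5 * dim * q * (xi_bar(xi, r) - xi(r))
    return finite, clustering

def power_law_xi(x, r0=5.0, gamma=1.8):
    """Hypothetical test input, not the concordance model correlation function."""
    return (x / r0) ** (-gamma)

finite, clustering = delta_dq(4, 100.0, power_law_xi, 0.02)
print(finite, clustering)
```

For a power law the volume average has the closed form $\bar{\xi} = 3\xi/(3-\gamma)$, which gives a simple check on the numerical integration before the routine is applied to a tabulated correlation function.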


